AMENDMENTS TO THE CLAIMS

The Assignee submits below a complete listing of the current claims, including marked-up
claims with insertions indicated by underlining and deletions indicated by strikeouts and/or
double bracketing. This listing of claims replaces all prior versions, and listings, of claims in the
application:

1-13. (Canceled). 

14. (Previously presented) A system for providing transcription of a conference 
between a plurality of participants of the conference, the system comprising: 

a plurality of reception stages to receive information from the plurality of participants 
over a respective plurality of transmission channels; and 

at least one processor capable of receiving the information from the plurality of reception 
stages, the at least one processor programmed to: 

analyze the information received at the plurality of reception stages to determine 
which of the plurality of participants of the conference is speaking during a given time 
interval based, at least in part, on identifying which of the plurality of reception stages is 
receiving speech information; 

select one of the plurality of transmission channels corresponding to the reception 
stage identified as receiving speech information as an in-use channel; 

determine channel information including at least one transmission parameter of 
the in-use channel;

extract at least one feature vector from the speech information based, at least in 
part, on the channel information; 

perform acoustic segmentation of the speech information to generate acoustic 
segmentation information indicating at least one segment identified in the speech 
information based, at least in part, on the channel information and the at least one feature 
vector, the acoustic segmentation information including a label for the at least one 
segment of the speech information indicating whether the at least one segment is
associated with speech, a pause in speech or non-speech; 

determine a language of the speech information based, at least in part, on the 
channel information, the at least one feature vector and the acoustic segmentation
information; and 

generate text information corresponding to words recognized in the speech 
information based, at least in part, on the channel information, the at least one feature
vector, the acoustic segmentation information and the language. 

15. (Previously presented) The system of claim 14, wherein the plurality of reception
stages include at least two of the following: 

at least one sound card installed in at least one computer, the sound card connected to at 
least one microphone; 

at least one connection adapted to receive at least one analog telephone line; 

at least one connection adapted to receive at least one digital telephone line; 

at least one connection adapted to receive at least one Integrated Services Digital 
Network (ISDN) telephone line; 

at least one connection adapted to receive at least one data network channel; and 

at least one connection adapted to receive a voice-over-internet-protocol (VoIP) data
stream.

16. (Previously presented) The system of claim 15, wherein the channel information
includes bandwidth information of the in-use channel.

17. (Currently amended) The system of claim 15, wherein the at least one processor
is programmed to recognize at least one key word in the speech information based, at least in
part, on the language of the speech information, and wherein [[the]] a speech recognizer provides
the text information based, at least in part, on the at least one key word.

18. (Previously presented) The system of claim 17, wherein the at least one processor
is programmed to recognize a speaker group associated with the speech information based, at
least in part, on the channel information and the language of the speech information, and wherein
the speech recognizer provides the text information based, at least in part, on the speaker group.

19. (Currently amended) A method of providing transcription of a conference
between a plurality of participants of the conference, the method comprising: 

receiving information over a plurality of transmission channels from the plurality of 
participants; 

analyzing the information received at the plurality of reception stages to determine which 
of the plurality of participants of the conference is speaking during a given time interval based, at 
least in part, on identifying which of the plurality of reception stages is receiving speech
information;

selecting one of the plurality of transmission channels corresponding to the reception 
stage identified as receiving speech information as an in-use channel; 

determining channel information including at least one transmission parameter [[of]] that 
identifies the in-use channel; 

extracting at least one feature vector from the speech information based, at least in part, 
on the channel information; 

performing acoustic segmentation of the speech information to generate acoustic 
segmentation information indicating at least one segment identified in the speech information 
based, at least in part, on the channel information and the at least one feature vector, the acoustic
segmentation information including a label for the at least one segment of the speech information 
indicating whether the at least one segment is associated with speech, a pause in speech or non- 
speech; 

determining a language of the speech information based, at least in part, on the channel 
information, the at least one feature vector and the acoustic segmentation information; and 

generating text information corresponding to words recognized in the speech information
based, at least in part, on the channel information, the at least one feature vector, the acoustic 
segmentation information and the language of the speech information. 

20. (Previously presented) The method of claim 19, wherein receiving speech 
information over a plurality of transmission channels includes receiving speech information via 
at least two of the following: 

at least one sound card installed in at least one computer, the sound card connected to at
least one microphone; 

at least one analog telephone line;

at least one digital telephone line;

at least one Integrated Services Digital Network (ISDN) telephone line; 

at least one data network channel; and 

at least one voice-over-internet-protocol (VoIP) data stream.

21. (Previously presented) The method of claim 20, wherein the channel information
includes bandwidth information of the in-use channel.

22. (Previously presented) The method of claim 19, further comprising recognizing
at least one key word in the speech information based, at least in part, on the language of the
speech information, and providing the text information is based, at least in part, on the at least 
one key word. 

23. (Previously presented) The method of claim 22, further comprising recognizing a
speaker group associated with the speech information based, at least in part, on the channel 
information and the language of the speech information, and wherein providing the text 
information is based, at least in part, on the speaker group. 

24. (Currently amended) A computer readable storage device encoded with a
plurality of instructions for execution on at least one processor, the plurality of instructions, 
when executed on the at least one processor, performing a method of providing transcription of a 
conference between a plurality of participants of the conference, the method comprising: 

receiving information over a plurality of transmission channels from the plurality of
participants; 

analyzing the information received at the plurality of reception stages to determine which 
of the plurality of participants of the conference is speaking during a given time interval based, at 
least in part, on identifying which of the plurality of reception stages is receiving speech 
information; 

selecting one of the plurality of transmission channels corresponding to the reception 
stage identified as receiving speech information as an in-use channel;

determining channel information including at least one transmission parameter [[of]] that
identifies the in-use channel; 

extracting at least one feature vector from the speech information based, at least in part, 
on the channel information; 

performing acoustic segmentation of the speech information to generate acoustic 
segmentation information indicating at least one segment identified in the speech information 
based, at least in part, on the channel information and the at least one feature vector, the acoustic 
segmentation information including a label for the at least one segment of the speech information 
indicating whether the at least one segment is associated with speech, a pause in speech or non- 
speech; 

determining a language of the speech information based, at least in part, on the channel 
information, the at least one feature vector and the acoustic segmentation information; and 

generating text information corresponding to words recognized in the speech information 
based, at least in part, on the channel information, the at least one feature vector, the acoustic 
segmentation information and the language of the speech information. 

25. (Previously presented) The computer readable storage device of claim 24,
wherein receiving speech information over a plurality of transmission channels includes 
receiving speech information via at least two of the following: 

at least one sound card installed in at least one computer, the sound card connected to at 
least one microphone; 

at least one analog telephone line;

at least one digital telephone line;

at least one Integrated Services Digital Network (ISDN) telephone line; 

at least one data network channel; and 

at least one voice-over-internet-protocol (VoIP) data stream.

26. (Previously presented) The computer readable storage device of claim 25, 
wherein the channel information includes bandwidth information of the in-use channel. 

27. (Previously presented) The computer readable storage device of claim 24, further 
comprising recognizing at least one key word in the speech information based, at least in part, on 
the language of the speech information, and providing the text information is based, at least in 
part, on the at least one key word. 

28. (Previously presented) The computer readable storage device of claim 27, further 
comprising recognizing a speaker group associated with the speech information based, at least in 
part, on the channel information and the language of the speech information, and wherein 
providing the text information is based, at least in part, on the speaker group.



